BACKGROUND AND INTRODUCTION

The invention relates generally to fault tolerant computer systems such as
lockstep fault tolerant computers which use multiple subsystems that run
identically.

In such lockstep fault tolerant computer systems, the outputs of the subsystems
are compared within the computer and, if the outputs differ, some corrective
action is taken.

Figure 1 of the accompanying drawings is a schematic overview of an example
of a typical system, in which three identical processing (CPU) sets 10, 11, 12
operate in synchronism (sync) under a common clock 16. By a processing set is
meant a subsystem including a processing engine, for example a central
processing unit (CPU), and internal state storage. Figure 2 of the accompanying
drawings is a schematic representation of such a processing set. This shows a
processing engine 20, internal state storage (memory) 22 and an internal bus 23.
The processing set may include other elements of a computer system, but will not
normally include input/output interfaces. External connections are also
provided, for example a connection 13 from the internal bus 23, an input 15 for
the external clock 16 and hardware interrupt inputs 14.

As shown in Figure 1, the outputs of the three processing sets 10, 11, 12 are
supplied to a fault detector unit (voter) 17 to monitor the operation of the
processing sets 10, 11, 12. If the processing sets 10, 11, 12 are operating
correctly, they produce identical outputs to the voter 17. Accordingly, if the
outputs match, the voter 17 passes commands from the processing sets 10, 11, 12
to an input/output (I/O) subsystem 18 for action. If, however, the outputs from
the processing sets differ, this indicates that something is amiss, and the
voter causes some corrective action to occur before acting upon an I/O
operation.

Typically, a corrective action includes the voter supplying a signal via the
appropriate line 14 to a processing set showing a fault to cause a 'change me'
light (not shown) to be illuminated on the faulty processing set. The defective
processing set is switched off and an operator then has to replace it with a
correctly functioning unit. In the example shown, a defective processing set can
normally be easily identified by majority voting because of the two-to-one vote
that will occur if one






processing set fails or develops a temporary or permanent fault.
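By way of illustration only, the two-to-one vote described above may be
sketched as follows. The function name find_out_of_sync and the
representation of each processing set's output as a comparable value are
assumptions introduced for this sketch; they do not appear in the description.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def find_out_of_sync(outputs: Sequence[Hashable]) -> Optional[int]:
    """Return the index of the dissenting processing set, or None if all agree.

    `outputs` holds one output value per processing set (three in Figure 1).
    Raises ValueError when no single dissenter can be identified, which is
    the case where further diagnostic operations would be needed.
    """
    if len(outputs) < 3:
        raise ValueError("majority voting needs at least three processing sets")
    counts = Counter(outputs)
    if len(counts) == 1:
        return None  # outputs match: the command can be passed on for action
    majority_value, votes = counts.most_common(1)[0]
    if votes != len(outputs) - 1:
        raise ValueError("no clear majority; further diagnosis is needed")
    # Exactly one output differs from the majority value.
    return next(i for i, out in enumerate(outputs) if out != majority_value)
```

With three sets, a return value of 1 would indicate that the second set in
the sequence has been outvoted two to one.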

However, the invention is not limited to such systems, but is also applicable
to systems where extensive diagnostic operations are needed to identify the
faulty processing set. The system need not have a single voter, and need not
vote merely I/O commands. The invention is generally applicable to synchronous
systems with redundant components which run in lockstep.

Lockstep systems depend on total synchronisation of the processing sets that
make up the fault tolerant processing core. Accordingly, the processing sets
need hardware which operates identically and, in addition, the internal stored
data in the processing sets also needs to be identical. Part of the process of
integrating a new processing set into a running system involves copying the
contents of the main memory of a running system to the new processing set.
Because main memory can be very large, for example of the order of gigabytes,
this process can take rather a long time in computing terms.

Lockstep computer systems can go out of sync for various reasons. The prime
reason is a failure of a single processing set in a permanent way. Recovery from
such a failure normally involves removal of the failed unit, replacement with a
functioning unit and reinstatement of the functioning unit. Clearly, the new
processing set will have no notion of the contents of memory of a running
processing set, and all of the main memory from the running system will have to
be copied to the new processing set.

Other, less traumatic out-of-sync events can often be diagnosed automatically
by the running computer system and can lead to the automatic reintegration of
the out-of-sync processing set without its replacement. For example, a soft data
error in a dynamic memory, perhaps caused by a cosmic ray event, could cause a
minor upset in operation that could be fixed automatically. However, this has
still required the reintegration of the memory state of the out-of-sync
processing set, that is the copying of the contents of the main memory from a
running system to the out-of-sync processing set. Accordingly, because the main
memory can be very large, this can still take a long time in computing terms.

The invention seeks to provide an automatic and rapid way of recovering from
minor out-of-sync events which avoids the problems of the prior art.

SUMMARY OF THE INVENTION

In accordance with one aspect of the invention, there is provided a memory
management system for a fault tolerant computer system, the memory management
system comprising: a first recording mechanism which can be activated to record
memory update events; a second recording mechanism having a capacity to record
at least a limited number of memory update events; a fault input for a fault
signal to activate the first recording mechanism in the event of a fault event;
and a memory reintegration mechanism to reintegrate at least parts of memory
identified in the first and second recording mechanisms.

Embodiments of the invention take advantage of the fact that, after a minor
out-of-sync event between processing sets in a lockstep system, most of the
memory contents of the out-of-sync processing set is initially identical to that
in a running system. Only a relatively small number of locations within the
memory system of either the out-of-sync processing set or the running system
will have been modified. However, the divergence will increase with time as the
running system continues to operate and execute its normal processing functions.
Embodiments of the invention allow for the divergence to be tracked and
accounted for and, moreover, for any memory update events around the out-of-sync
event and before the first recording mechanism has been activated to be caught.

Preferably, the recording of memory updates (writes) is not based on
recording each address accessed, but rather on memory segments (pages) updated
(written to). In other words, the first and/or second recording mechanisms
preferably record the segments (or pages) updated (written to). This can be done
effectively using a segment (or page) memory with a bit per segment (page) for
identifying the segments (pages) written to.
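A bit-per-page record of this kind may be sketched as follows. This is a
minimal illustration under stated assumptions: the class name PageBitmap, the
4096-byte page size and the method names are hypothetical and are not taken
from the description.

```python
from typing import List

PAGE_SIZE = 4096  # assumed page size in bytes, chosen only for illustration


class PageBitmap:
    """Records which pages have been written to, using one bit per page."""

    def __init__(self, num_pages: int) -> None:
        self.num_pages = num_pages
        self.bits = bytearray((num_pages + 7) // 8)

    def mark_write(self, address: int) -> None:
        """Set the bit of the page containing `address`."""
        page = address // PAGE_SIZE
        if not 0 <= page < self.num_pages:
            raise IndexError("address outside the recorded memory")
        self.bits[page // 8] |= 1 << (page % 8)

    def written_pages(self) -> List[int]:
        """Return the numbers of all pages whose bit is set."""
        return [p for p in range(self.num_pages)
                if self.bits[p // 8] & (1 << (p % 8))]
```

Recording per page rather than per address keeps the record small: many writes
to the same page cost only a single bit.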

In accordance with another aspect of the invention, there is provided a fault
tolerant computer system comprising a plurality of synchronous processing sets,
each comprising a processor with internal memory and operating in lockstep, and
an out-of-sync detector for detecting an out-of-sync event and for generating an
out-of-sync signal, wherein each processing set also comprises: a first
recording mechanism which can be activated to record memory write events; a
second recording mechanism having a capacity to record at least a limited number
of memory write events; an input for receiving the out-of-sync signal to
activate the first recording mechanism in the event of an out-of-sync event; and
a memory reintegration mechanism to reintegrate in an out-of-sync processing set
at least parts of memory identified in the first and second recording
mechanisms.

In accordance with a further aspect of the invention, there is provided a
method for reintegration of a processing set of a fault tolerant computer system
following a fault, wherein the fault tolerant computer system comprises a
plurality of synchronous processing sets, each comprising a processor and
internal memory and operating in lockstep, and a fault detector for detecting a
fault event and for generating a fault signal, the method comprising:

maintaining a temporary record of memory update events over a limited period;

responding to a fault to activate a further record of memory update events
following the fault state; and

performing memory reintegration in a processing set in which a fault has
occurred for at least those parts of memory identified in the temporary and
further memory records.

In an embodiment of the invention a record is kept of at least selected memory
access events (memory write events) to main memory after the out-of-sync event,
so that only the modified memory locations need to be copied to reintegrate the
out-of-sync processing set.

In one embodiment of the invention, the first recording mechanism is a
memory management unit comprising a RAM with an entry for each of a plurality of
memory pages, a code being written to a page entry each time that page is
modified when the first recording mechanism has been activated.

Preferably, the first recording mechanism has an enable input connected to
receive the fault (out-of-sync) signal.

The first recording mechanism can record an arbitrarily large number of
written pages, up to the total number of pages in the processing unit.

The second recording mechanism preferably maintains a rolling record of
recent memory update events up to a number sufficient to cover the time to
activate the first recording mechanism following a fault event.

The second recording mechanism can comprise a first-in-first-out buffer, the
first recording mechanism in one embodiment being connected to an output of the
first-in-first-out buffer. In this configuration, the first-in-first-out buffer
stores up to a predetermined number of update addresses, an address decoder can
be connected to the output of the first-in-first-out buffer to generate a page
signal representative of a memory update address output from the
first-in-first-out buffer and the address decoder is responsive to the
out-of-sync signal to pass the page signal to the first recording mechanism.

Alternatively, the second recording mechanism can comprise a logic analyzer.
This can reduce implementation costs as fault tolerant computers typically
include a logic analyzer for fault analysis.

Where the output of the second recording mechanism does not form the input
to the first recording mechanism, the operation of the second recording means is
preferably inhibited in response to the out-of-sync signal.

The first recording mechanism can comprise a software generated table in
which a record corresponding to a page of memory is marked with a code whenever
that page has been written. This record can be maintained by software which
updates entries in the translation look-aside buffer (TLB) of the processor. The
second recording mechanism can be the contents of the TLB together with a list
of pages recently flushed from the TLB.

In response to an out-of-sync input, software can search the TLB and the list
for pages which may recently have been written and may mark these as written in
the first recording mechanism, then continue to maintain the first recording
mechanism until the processing units are reintegrated.

Preferably, the memory reintegration mechanism is operative to reintegrate
memory pages identified in the first and second recording mechanisms.

The invention is applicable to a computer system comprising three
synchronous processing sets operating in lockstep, wherein an out-of-sync
detector determines an out-of-sync processing set by majority voting.

In this case, reintegration of an out-of-step processing set can be achieved
by, in response to the identification of an out-of-sync processing set,
selecting one of the remaining two processing sets, supplying an interrupt to
the out-of-sync processing set and the remaining processing set to cause the
out-of-sync and remaining processing sets to idle, reintegrating one of the
out-of-sync and remaining processing sets while maintaining a software log of
memory write events, and then reintegrating the other of the out-of-sync and
remaining processing sets using the software log.

The invention described can reduce the reintegration time of a processing set
in a lockstep fault tolerant computer from many minutes to just fractions of a
second. During the reintegration period, the computer is vulnerable to a fault
in the running processing set. Thus the reduction in reintegration time has a
significant benefit on the overall availability of the computer.

DESCRIPTION OF THE DRAWINGS 

An embodiment of the invention will be described hereinafter with reference
to the accompanying drawings in which like reference signs relate to like
features and in which:

Figure 1 is a schematic overview of a triple-modular-redundant fault tolerant
computer system;

Figure 2 is a schematic representation of elements of a processor set of the
system of Figure 1;

Figure 3 is a schematic representation of a processor set of an embodiment of
the invention;

Figure 4 is a schematic representation of a memory management unit;

Figure 5 is a schematic representation of an example of a first recording
mechanism;

Figure 6 is a schematic representation of an example of a secondary recording
mechanism;

Figure 7 is a schematic representation of another example of a secondary
recording mechanism;

Figure 8 is a schematic representation of an example of a combined first and
secondary recording mechanism; and

Figure 9 is a schematic representation of an alternative configuration of the
example of Figure 8.






DESCRIPTION OF THE PREFERRED EMBODIMENT

Figure 3 is a schematic block diagram to represent elements of an example of
the invention in functional terms. Figure 3 generally represents one of the
processing sets 10/11/12 for a fault tolerant computer system such as, for
example, the system shown in Figure 1.

In Figure 3, a processing engine (e.g., a central processing unit (CPU)) 20 and
internal state storage (memory) 22 are connected by an internal bus 23. External
connections are also provided, for example a connection 13 from the internal bus
23, an input 15 for an external clock and hardware interrupt input 14.

Also shown schematically in Figure 3 are a first recording mechanism 25
which can be activated to record memory update events, a second recording
mechanism 26 having a capacity to record at least a limited number of memory
update events and a memory reintegration mechanism 27 to reintegrate at least
parts of memory identified in the first and second recording mechanisms. As
shown in Figure 3, each of the mechanisms 25, 26 and 27 is shown connected to
the internal bus 23. This is because the first and second recording mechanisms
25 and 26 need to monitor memory access events to identify when and where memory
updating occurs (memory writes). Also the reintegration mechanism 27 needs to
access the first and second recording mechanisms to determine where memory
writes have occurred in the memories of out-of-sync and running processing sets
and then to copy corresponding memory portions from the running to out-of-sync
processor set memories. However, the mechanisms 25, 26 and 27 can be implemented
in various ways as will be explained in the following description. Various
implementations involve different combinations of hardware and software, and the
interconnection of the various elements will typically differ from that
illustrated in Figure 3. For example, the reintegration mechanism will typically
be implemented in software, and may be implemented in a control computer
associated with and/or forming part of the voter 17. Also, the first recording
mechanism 25 may not be connected directly to the bus 23, but may be connected
via the second recording mechanism 26. Also the first and second recording
mechanisms may be implemented to a greater or lesser extent in software, as will
be described hereinafter.

Computer systems typically include memory management hardware to keep
track of and control the use of main memory. It is also usual to divide memory
into pages of specified size and to keep a small record of access controls to
each page. Hardware mechanisms also exist for updating a record for a page with
a bit that indicates that the page has been modified. This bit is the so-called
'dirty' bit for a page. A page of memory is called 'clean' when no writes have
been made to it to change it from its initial state, and 'dirty' after such a
write has happened. Software can cause a page to be marked 'clean' by clearing
the dirty bit for that page in the memory management unit record. Hardware will
later set the bit to 1 to indicate that the page has been written to. In normal
operation, many pages of computer memory will be considered by the memory
management unit to be dirty most of the time. Accordingly, if a conventional
memory management unit operating in a conventional manner is provided in each of
the processing sets of a lockstep fault tolerant computer system, it is thus
likely that many pages will be marked dirty when an out-of-sync event occurs.

Because the memory management unit of a conventionally configured
computer processing set is usually under the control of the running operating
system, in a first embodiment of the invention an additional memory management
unit is provided purely for the use of the software which reintegrates
processing sets after an out-of-sync event.

Figure 4 shows a conventional memory management unit 40 which has been
customised to include only information on which pages of memory are dirty and
which are clean. In the following description this type of memory management
unit is termed a 'dirty ram'. Software 42 may access the dirty ram storage 46 to
check which pages are dirty, and can write it directly to change the status of a
page to dirty or clean. In addition, hardware 44 automatically changes to
'dirty' the state of the record for any page of main memory which is written to
via the bus 23. In this embodiment only one bit of dirty ram storage 46 is used
for an entire page of main memory. It is not necessary that the size of the
'pages' monitored by the dirty ram is the same as the size used by other memory
management units in the system, but it is often both convenient and efficient
that the pages all have the same size. Computers tend to work in pages and a
write access to one part of a page often implies that other locations within the
same page will also be written. However, a


conventional memory management unit as shown in Figure 4 will not in itself be
sufficient to implement the invention because most of the pages are usually
dirty as described above.

Figure 5 is a schematic block diagram of a dirty ram 50 for a first
embodiment of the invention. In Figure 5, the dirty ram 50 is provided with a
hardware enable input 58 whereby the hardware 54 only begins to log dirty pages
in the dirty ram storage 56 after the processing sets have gone out of sync. The
signal on the enable input is asserted in response to the detection by the voter
17 of an out-of-sync event.

The dirty ram enable input 58 allows the operation of the dirty ram system.
In normal operation, with processing sets running in sync, the dirty ram enable
input is deasserted and the dirty ram 50 is set by the software 52 such that all
pages are given 'clean' status.

When an out-of-sync event occurs, the enable input 58 becomes asserted.
While the enable input is asserted (i.e., while the processing sets are out of
sync), the dirty ram logs the pages of main memory written to. The pages which
are written to will be those with potentially different contents in the
processing sets. A dirty ram with an enable input as in Figure 5 is provided in
each processing set and is connected there to the respective system bus 23.
While the processing sets are running in sync, each dirty ram is held in the
clean state by the software 52. When it is detected that the processing sets are
running out of sync the dirty ram logging is enabled. In this embodiment, a
hardware enable signal 58 is generated in the out-of-sync detection hardware
(i.e., the voter 17) when the processing sets 10, 11, 12 are out-of-sync. In
other words, whenever the voter detects a difference in the outputs from the
processing sets, it generates a signal which is supplied to each processing set
to form the asserted out-of-sync signal. Once asserted, the out-of-sync signal
is not negated until the processing sets have been reintegrated. In other
embodiments, the enable signal could be generated by software.
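The behaviour of a dirty ram with an enable input, as described for Figure 5,
may be modelled as in the following sketch. The class DirtyRam and its method
names are hypothetical and serve only to illustrate the gating of the logging by
the out-of-sync signal; the actual mechanism described is hardware.

```python
from typing import List


class DirtyRam:
    """Model of a dirty ram whose logging is gated by an enable input."""

    def __init__(self, num_pages: int) -> None:
        self._dirty = [False] * num_pages
        self.enabled = False  # deasserted while the sets run in sync

    def clear(self) -> None:
        """Software action: give every page 'clean' status."""
        self._dirty = [False] * len(self._dirty)

    def set_enable(self, asserted: bool) -> None:
        """Drive the enable input (asserted on an out-of-sync event)."""
        self.enabled = asserted

    def observe_write(self, page: int) -> None:
        """Called for each bus write; logs the page only while enabled."""
        if self.enabled:
            self._dirty[page] = True

    def dirty_pages(self) -> List[int]:
        """Return the pages written since the enable input was asserted."""
        return [p for p, d in enumerate(self._dirty) if d]
```

In this model, writes observed before set_enable(True) leave the record clean,
which corresponds to the dirty ram being held clean while in sync.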

After an out-of-sync event, software and/or hardware mechanisms act to
re-configure the fault tolerant computer system. The system carries on running
normal operations with at least one processing set. At least one processing set,
including the out-of-sync processing set, is taken out of operation. This
out-of-sync processing set thereby stops running normal operations and waits to
be reintegrated into the running system. Memory writes done on the running and
the out-of-sync processing set produce divergence in the main memory contents in
the running and out-of-sync processing sets.

When software on the running system comes to reintegrate the out-of-sync
processing set, it accesses the dirty ram on the running system to find pages of
memory that have been dirtied since the out-of-sync event. It also accesses the
dirty ram in the out-of-sync processing set. This dirty ram tells which pages
have been modified by the out-of-sync processor(s) since the divergence. If a
page of memory is mentioned as dirty in any of the dirty rams, on the running
and out-of-sync processing sets, it has to be copied by the reintegration
software to bring the out-of-sync processing sets back into sync. If a page of
memory is not marked as dirty in any dirty ram, it can be ignored, as it will
still be correct on the out-of-sync processing set.
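The selection rule just described (copy a page if it is dirty in any dirty
ram, ignore it otherwise) may be sketched as follows. The function names, and
the modelling of main memory as a list holding one bytes object per page, are
assumptions made purely for illustration.

```python
from typing import Iterable, List


def pages_to_copy(*dirty_rams: Iterable[int]) -> List[int]:
    """Return every page marked dirty in at least one dirty ram."""
    pages = set()
    for ram in dirty_rams:
        pages.update(ram)
    return sorted(pages)


def reintegrate(running_mem: List[bytes],
                out_of_sync_mem: List[bytes],
                *dirty_rams: Iterable[int]) -> List[int]:
    """Copy only the dirty pages from the running to the out-of-sync memory.

    Returns the list of pages copied. Pages dirty in no dirty ram are left
    untouched, since they are still identical in both memories.
    """
    copied = pages_to_copy(*dirty_rams)
    for page in copied:
        out_of_sync_mem[page] = running_mem[page]
    return copied
```

A page dirtied only on the out-of-sync set must still be copied, because its
contents there no longer match the running system; hence the union rather
than only the running system's record.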

In an alternative embodiment of the invention, if the processing sets have a
dirty ram store with no enable pin, operating all the time to log dirty pages,
software could be activated by a hardware signal on the out-of-sync event to
clean out the dirty ram. This software must carefully note any pages which it
itself dirties during the cleaning process.

In yet another alternative embodiment, an ordinary memory management unit
can also be used to collect the dirty page information. In this alternative
embodiment, software is arranged to modify the page tables at the out-of-sync
event so that all pages of main memory are write protected. This means that
write cycles to memory will result in a bus error exception to the processor.
The processor can then act on each bus error first to add the written page to a
software-maintained list of dirty pages, then to remove the write protection for
that page so that future writes there will complete normally. This has the
advantage that only a single list of dirty pages need be examined by the
reintegration software, with no searching through clean pages to look for
occasional dirty ones.
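The write-protection scheme may be illustrated by the following simulation.
It does not manipulate real page tables; the class WriteProtectTracker and its
methods are hypothetical stand-ins for the page table update and the bus error
handler described above.

```python
from typing import List


class WriteProtectTracker:
    """Simulates dirty page collection by write-protecting all pages."""

    def __init__(self, num_pages: int) -> None:
        self.write_protected = [False] * num_pages
        self.dirty_list: List[int] = []  # software-maintained list

    def on_out_of_sync(self) -> None:
        """Write protect every page at the out-of-sync event."""
        self.write_protected = [True] * len(self.write_protected)

    def write(self, page: int) -> None:
        """Model a write cycle; a protected page raises a 'bus error'."""
        if self.write_protected[page]:
            self._bus_error(page)
        # The write itself now completes normally (not modelled here).

    def _bus_error(self, page: int) -> None:
        """Handler: note the page as dirty, then remove its protection."""
        self.dirty_list.append(page)
        self.write_protected[page] = False
```

Because protection is removed on the first fault, each page appears in the
list at most once and later writes to it incur no further exceptions.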

It should be noted, however, that it is desirable to provide a separate 'dirty
memory' rather than to use conventional memory management units with additional
software to collect dirty page information. This is because the use of
conventional memory management units suffers from disadvantages. Firstly, the
conventional computer operating system software may be using the memory
management unit for its own purposes. Secondly, conventional memory management
units work for only a single processor and cannot cover multi-processor
operation or direct memory access by I/O devices.

Whatever method is used to collect data on dirty memory pages, there are
likely to be problems near the out-of-sync event. Some time has to elapse
between the detection of the out-of-sync event and the enabling of the dirty ram
data collection, and a few dirty pages may go unrecorded in this period. Exactly
how many pages are missed depends on the implementation of the dirty ram, but
even a single missed page is enough to make useless the scheme of copying only
selected areas of main memory after an out-of-sync event.

Accordingly, in embodiments of the invention, a mechanism is also provided
for recording memory write events around the out-of-sync event.

The mechanisms described above for providing a record of dirty pages are put
into operation starting at the out-of-sync event, and can record an arbitrary
number of pages dirtied following that event. However, to complement this a
separate, temporary, record is required for pages dirtied close to the
out-of-sync event. This separate record has to take account of write events over
a limited time, preferably on a rolling basis. This separate record needs to
have a capacity sufficient to accommodate write events which might occur between
the out-of-sync event itself and the time the mechanisms described above can
start recording. This separate record is called the secondary dirty page record
in the following description, to distinguish it from the 'dirty ram' already
discussed above.

The secondary dirty page record has to be operating continually (at least until
an out-of-sync event occurs), because it cannot be predicted when a fault may
send the system out-of-sync. It is the job of the secondary dirty page record to
remember those pages, dirtied just before or after an out-of-sync event, that
may not be properly collected by the dirty ram. The secondary dirty page record
also has to have limited time memory. If it remembered pages dirtied
indefinitely far in the past, it would eventually list all memory pages as being
dirty. It should only remember far enough back past the out-of-sync event to
ensure that divergently dirtied pages which the primary dirty page store cannot
catch are included.

In some embodiments described below, where the secondary dirty page record
operates in parallel with the dirty ram, the secondary page record is frozen at
or soon after the out-of-sync event. In these embodiments, if it were left
running, the limited time nature of the record could eventually allow the
important information about the out-of-sync event to be overwritten or lost. The
freezing can conveniently be done by responding to the asserting of the dirty
ram enable signal to inhibit the operation of the secondary page record.
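A rolling, limited-capacity record that can be frozen may be sketched as
follows. The class SecondaryRecord and its capacity parameter are assumptions
for illustration; a bounded deque stands in for whatever storage an
implementation would actually use.

```python
from collections import deque
from typing import Set


class SecondaryRecord:
    """Rolling record of the most recently written pages."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque silently drops the oldest entry when full,
        # which gives the 'limited time memory' described above.
        self._recent = deque(maxlen=capacity)
        self.frozen = False

    def observe_write(self, page: int) -> None:
        """Record a write unless the record has been frozen."""
        if not self.frozen:
            self._recent.append(page)

    def freeze(self) -> None:
        """Inhibit further recording (e.g. on the dirty ram enable signal)."""
        self.frozen = True

    def pages(self) -> Set[int]:
        """Return the pages currently held in the record."""
        return set(self._recent)
```

The capacity would need to be chosen to cover at least the writes that can
occur between the out-of-sync event and the start of primary recording.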

Once the secondary dirty page record has been frozen, either software or
hardware can examine it and cause dirty pages listed there to be added to the
primary dirty ram, or to a separate list for copying by the reintegration
software. Both out-of-sync and running processing sets have secondary dirty page
records. Software can examine and compare the records and deduce which pages
were actually dirtied in sync by the processing sets, if needed. This will
decrease the number of pages to be copied.

In one embodiment a logic analyzer is used to collect information for the
secondary dirty page record. Figure 6 shows a logic analyzer 60 observing the
bus 23 of a processing set. A logic analyzer 60 with a trigger mechanism 66,
clock qualifier 62 and address generator 64, is provided for each processing
set. The logic analyzer is usually running. The analyzer 60 is triggered,
causing data collection to stop, by the assertion of the processing set
out-of-sync signal, which same signal starts the primary dirty ram collecting
data. The logic analyzer eventually stops operating and keeps a record of
computer bus operation both before and after the out-of-sync event. By analysing
the logic analyzer traces from the out-of-sync and running processing sets after
an out-of-sync event, it is possible to deduce which of the stored transactions
is a divergent write cycle. The relevant pages can then be added to a set of
pages written to in a software-maintained secondary dirty page record. The logic
analyzer needs to store at least the address and control information for each
bus cycle so that pages written to can be determined. The analysis of the logic
analyzer outputs can readily be effected by software routines.
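One possible form of such a software analysis routine is sketched below. It
rests on assumptions that are not stated in the description: that the two
traces are aligned cycle for cycle, that each entry is an (address, is_write)
pair, and that every write from the first differing cycle onwards is treated as
potentially divergent. The function name and the page size are hypothetical.

```python
from typing import Sequence, Set, Tuple

BusCycle = Tuple[int, bool]  # (address, is_write), an assumed trace format


def divergent_pages(trace_a: Sequence[BusCycle],
                    trace_b: Sequence[BusCycle],
                    page_size: int = 4096) -> Set[int]:
    """Return pages written at or after the first cycle where traces differ."""
    shortest = min(len(trace_a), len(trace_b))
    # Find the first bus cycle at which the two processing sets disagree.
    first_diff = next(
        (i for i, (a, b) in enumerate(zip(trace_a, trace_b)) if a != b),
        shortest,
    )
    pages: Set[int] = set()
    for trace in (trace_a, trace_b):
        for address, is_write in trace[first_diff:]:
            if is_write:
                pages.add(address // page_size)
    return pages
```

Writes before the first differing cycle are identical on both sets and so need
not be added to the secondary dirty page record under these assumptions.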

The advantage of using a logic analyzer for the secondary dirty page record
is that lockstep fault tolerant computers will typically have logic analyzers
built in, and triggered on the out-of-sync event, for fault diagnosis. It is
then merely necessary to provide the software to control and analyze the output
of the logic analyzers as described above.

As an alternative, a write buffer can provide storage for the secondary dirty
page record. Figure 7 shows a first-in-first-out memory 70 used as a short-term
buffer, for writes to main memory over the internal bus 23. In normal in-sync
operation, writes to main memory are decoded by write decode logic and the page
number of each write is written into the FIFO 70. When the out-of-sync event
occurs, the hardware out-of-sync detection signal 58 inhibits further writes
into the FIFO 70. Later, software can examine the FIFO 70 contents to add pages
to the dirty page list. The advantage of the write buffer for a secondary dirty
page record is that it is simpler in both software and hardware than a logic
analyzer.

In further alternatives, the write buffer can be arranged in series with the
dirty ram.

Figure 8 illustrates a first example of a combination of a first and secondary
recording mechanism, where a write buffer is arranged in series with a dirty
ram. The arrangement of Figure 8 is based on a combination of the arrangements
of Figures 4 and 7. In this case, a FIFO buffer 80 is operating continually to
record write events to memory for a processing set with the write events being
decoded continually by the write decode logic 81 to supply page addresses for
storage in the FIFO 80. In the present case it is not necessary for the write
decode logic to receive an inhibit input as will be described later. Page
addresses supplied to the FIFO 80 appear, after a delay, at the output of the
FIFO 80. However, they are prevented from being passed to the dirty ram storage
86 by means of the gate 84 until the gate is enabled by an out-of-sync signal on
the line 58. This out-of-sync signal effectively provides an enable signal for
the dirty ram as it then enables the page addresses from the FIFO 80 to be
supplied to the dirty ram storage 86, whereby appropriate page bits can be set.
Software 82 can be used to clear the dirty ram storage 86 at any time prior to
an out-of-sync event so that it is 'clean' when the out-of-sync signal is
supplied.

In this embodiment, it is not necessary to disable the FIFO buffer 80, as the
contents of the FIFO buffer 80 are automatically stored in the dirty ram after a
time dependent on the size of the FIFO 80 and the frequency of the write events.
In this embodiment the reintegration mechanism preferably takes account of the
dirty ram 86 and the FIFO 80.
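The series arrangement of a FIFO, a gate and a dirty ram may be modelled as in
the following sketch. The class SeriesRecorder, its attribute names and the
choice of a software queue are assumptions made for illustration of the data
flow only.

```python
from collections import deque
from typing import List


class SeriesRecorder:
    """FIFO feeding a dirty ram through a gate opened by out-of-sync."""

    def __init__(self, fifo_depth: int, num_pages: int) -> None:
        self.fifo = deque()
        self.fifo_depth = fifo_depth
        self.dirty = [False] * num_pages
        self.out_of_sync = False  # gate is closed while in sync

    def observe_write(self, page: int) -> None:
        """Push a page address; the oldest entry emerges once the FIFO is full."""
        self.fifo.append(page)
        if len(self.fifo) > self.fifo_depth:
            oldest = self.fifo.popleft()
            if self.out_of_sync:
                self.dirty[oldest] = True  # gate open: set the page bit
            # Gate closed: the emerging address is simply discarded.

    def pages_for_reintegration(self) -> List[int]:
        """Combine the dirty ram with the addresses still inside the FIFO."""
        in_ram = {p for p, d in enumerate(self.dirty) if d}
        return sorted(in_ram | set(self.fifo))
```

The final method reflects the point made above: addresses still in transit
through the FIFO have not yet reached the dirty ram, so both must be consulted.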

Figure 9 illustrates a second example of a combination of a first and secondary
recording mechanism where a write buffer is arranged in series with a dirty ram.
In this example, a FIFO buffer 90 is operating continually to record write
events to memory for a processing set. The output from the FIFO buffer 90 is
supplied to an address decoder 91 which is only enabled in response to the
out-of-sync enable signal 58. When the address decoder 91 is not enabled, the
output of the FIFO buffer is effectively discarded. Only when the address
decoder is enabled are dirty page bits output from the address decoder 91 for
storage in the dirty ram storage 96.

Optionally, the out-of-sync enable signal is also supplied to the dirty RAM
itself, although it will be appreciated that supplying the out-of-sync signal to
the address decoder effectively provides an enable signal for the dirty RAM. As
for the Figure 8 example, in this embodiment it is not necessary to disable the
FIFO buffer 90, as the contents of the FIFO buffer 90 are automatically stored in
the dirty RAM after a time dependent on the size of the FIFO 90 and the frequency
of the write events. In this embodiment the reintegration mechanism preferably
takes account of the dirty RAM 96 and the FIFO 90.
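
The series arrangement of Figure 9 can be pictured with a small software model.
The following is only an illustrative sketch, not the hardware of the
embodiment: the class name, the 4 KiB page size and the rule that an entry leaves
the FIFO when a new write overflows it are all assumptions made for the example.

```python
from collections import deque

PAGE_SHIFT = 12  # assumed 4 KiB pages; the description does not give a page size


class DirtyRecorder:
    """Toy model: write FIFO -> address decoder -> dirty RAM (hypothetical names)."""

    def __init__(self, fifo_depth, num_pages):
        self.fifo = deque()               # secondary record of recent write addresses
        self.fifo_depth = fifo_depth
        self.dirty = [False] * num_pages  # primary dirty RAM, one bit per page
        self.out_of_sync = False          # models the enable signal on line 58

    def _drain_one(self):
        address = self.fifo.popleft()
        if self.out_of_sync:
            # Decoder enabled: turn the address into a page and set its bit.
            self.dirty[address >> PAGE_SHIFT] = True
        # Decoder disabled: the FIFO output is simply discarded.

    def write(self, address):
        """Record a write; the oldest entry leaves once the FIFO is full."""
        self.fifo.append(address)
        if len(self.fifo) > self.fifo_depth:
            self._drain_one()

    def signal_out_of_sync(self):
        self.out_of_sync = True

    def dirty_pages(self):
        """Pages to reintegrate: dirty RAM bits plus pages still held in the FIFO."""
        pages = {page for page, bit in enumerate(self.dirty) if bit}
        pages.update(address >> PAGE_SHIFT for address in self.fifo)
        return pages


rec = DirtyRecorder(fifo_depth=2, num_pages=8)
for addr in (0x0000, 0x1000, 0x2000):  # writes made before the fault is signalled
    rec.write(addr)
rec.signal_out_of_sync()
rec.write(0x3000)
print(sorted(rec.dirty_pages()))  # [1, 2, 3]
```

In this model the write to page 0 left the FIFO before the out-of-sync signal
arrived and was discarded, while the later writes are found either in the dirty
RAM or still in the FIFO, which is why reintegration takes account of both.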

A further, software-implemented embodiment makes use of a table look-aside
buffer (TLB) and an associated TLB miss routine, plus a dirty page store which is
defined in main memory. A TLB forms a standard part of most computer addressing
schemes using paged memory. Some computers maintain TLB entries with a software
TLB miss routine, instead of fixed hardware. In this embodiment, and following a
fault event, software can note the TLB entries currently specifying writable
pages, and add these to the soft dirty page store. Software can also transfer to
this a list of writable pages recently flushed from the TLB. This list is
maintained by the miss routine in normal circumstances, before the fault event,
and contains pages which might have been written close to the fault event but
have since been flushed from the TLB. Following this, while reintegration is in
progress, software in the TLB miss routine adds each page written to the soft
dirty page store. In this way a record of dirty pages is created.
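
The bookkeeping performed by the miss routine can be sketched as follows. This
is a simplified model under stated assumptions, not the routine of the
embodiment: the class and method names are hypothetical, the TLB is a plain
dictionary with no capacity limit, and after the fault every page loaded as
writable is conservatively treated as written.

```python
from collections import deque


class SoftDirtyStore:
    """Toy model of a software TLB miss routine feeding a soft dirty page store."""

    def __init__(self, flushed_history=4):
        self.tlb = {}                                    # page -> writable flag
        self.recently_flushed = deque(maxlen=flushed_history)
        self.dirty_pages = set()                         # store held in main memory
        self.fault_seen = False

    def tlb_miss(self, page, writable):
        """Load a TLB entry; after a fault, note writable pages as dirty."""
        self.tlb[page] = writable
        if self.fault_seen and writable:
            self.dirty_pages.add(page)

    def flush(self, page):
        """Evict an entry, remembering it if it was writable."""
        if self.tlb.pop(page):
            self.recently_flushed.append(page)

    def on_fault(self):
        """On the fault event, seed the store from the TLB and the flushed list."""
        self.fault_seen = True
        self.dirty_pages.update(p for p, w in self.tlb.items() if w)
        self.dirty_pages.update(self.recently_flushed)


store = SoftDirtyStore()
store.tlb_miss(1, True)
store.tlb_miss(2, False)
store.tlb_miss(3, True)
store.flush(3)            # page 3 was writable and leaves the TLB before the fault
store.on_fault()
store.tlb_miss(5, True)   # page made writable while reintegration is in progress
print(sorted(store.dirty_pages))  # [1, 3, 5]
```

The bounded list of flushed pages plays the part of the secondary record here: it
only needs to reach back far enough to cover writes made close to the fault
event.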

The approaches described above can be applied to a triple-modular-redundant
(TMR) system. Some TMR lockstep systems switch to running with a single
processing set when an out-of-sync event occurs. This requires two processing
sets to be reintegrated in order to recover.

An example will be described where a TMR system is running with processing
sets 10, 11 and 12 in sync. In this example, reintegration is performed by
software under the control of a control processor forming part of the voter 17.

For this example, it is assumed here that processing set 12 suffers a random soft
error which takes the system out of sync. The voter detects the out-of-sync event
and arbitrarily chooses processing set 10 to carry on and idles processing sets
11 and 12. Each of processing sets 10, 11 and 12 has its own primary dirty RAM
and secondary dirty page record, all capturing the differentially dirtied data
since the out-of-sync event.

To reintegrate processing set 11, all of the pages marked as dirty in
processing set 10 dirty RAM, processing set 10 secondary dirty page record,
processing set 11 dirty RAM and processing set 11 secondary dirty page record
are copied from processing set 10 to processing set 11.

Then, to reintegrate processing set 12, the pages marked as dirty in
processing set 10 dirty RAM, processing set 10 secondary dirty page record,
processing set 12 dirty RAM and processing set 12 secondary dirty page record
are copied from processing set 10 to processing set 12.

During the reintegration of processing set 11, the processing set 10 has to
continue to record the dirtying of pages. If the processing set 10 dirty page
RAM is inoperative during some part of this process, software must maintain a
separate list of pages being dirtied to add to the list of pages copied to
processing set 12.
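
The two-stage recovery can be summarised in a few lines of code. This is a
minimal sketch under assumptions of its own: memories are modelled as
dictionaries from page number to contents, each dirty record is a set of page
numbers, and the function name and example values are hypothetical. How
processing set 11 itself picks up writes made during its own reintegration is
not detailed in the passage above; the sketch simply includes the software log
in the same copy.

```python
def reintegrate(source_mem, target_mem, *dirty_records):
    """Copy every page named in any dirty record from source to target."""
    pages = set().union(*dirty_records)
    for page in sorted(pages):
        target_mem[page] = source_mem[page]
    return pages


# Hypothetical state after the out-of-sync event; set 10 carries on running.
mem10 = {0: "a", 1: "b2", 2: "c", 3: "d"}
mem11 = {0: "a", 1: "b", 2: "c", 3: "d"}   # idled, missed the write to page 1
mem12 = {0: "a", 1: "b", 2: "X", 3: "d"}   # soft error corrupted page 2

ram10, sec10 = {1}, set()      # dirty RAM and secondary record of set 10
ram11, sec11 = set(), set()
ram12, sec12 = set(), {2}

# Set 10 dirties page 3 while set 11 is being reintegrated; software logs it.
mem10[3] = "d2"
software_log = {3}

reintegrate(mem10, mem11, ram10, sec10, ram11, sec11, software_log)
reintegrate(mem10, mem12, ram10, sec10, ram12, sec12, software_log)
print(mem10 == mem11 == mem12)  # True
```

Only pages 1, 2 and 3 are copied in this example; page 0 is never touched, which
is the saving the dirty page records are intended to provide.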

There has been described, therefore, embodiments of the invention in which
there are provided: a primary dirty RAM, for example a [...] memory management
unit or a conventional memory management unit with control mechanisms; a
secondary dirty page record which records a limited number of write events to
memory around the out-of-sync event, sufficient to capture all divergent writes
until the primary dirty RAM is enabled; and a mechanism, either hardware or
software, to start the primary dirty RAM recording the dirtying of pages shortly
after an out-of-sync event, and where appropriate to stop the secondary dirty
page record. By examining the primary dirty RAM and secondary dirty page record
of two processing sets and copying the pages of memory mentioned as dirty in any
dirty page record to the out-of-sync processing set, the system can be
reintegrated.



Although particular embodiments of the invention have been described, it will
be appreciated that many modifications and/or additions may be made within the
spirit and scope of the present invention.

For example, various combinations of the first and second recording
mechanisms described above may be provided. Also the various elements and
techniques described above may be implemented using appropriate hardware or
software technology.

Although a specific example of a TMR system has been described, the
invention is not limited thereto. Moreover, other methods than majority voting
can be used to identify an out-of-sync processing set.

Although particular embodiments of the invention have been described, it will
be appreciated that the invention is not limited thereto, and many modifications
and/or additions may be made within the spirit and scope of the invention as
defined in the appended Claims. For example, different combinations of the
features of the dependent Claims may be combined with the features of the
independent Claims.

WHAT I CLAIM IS:

1. A memory management system for a fault tolerant computer system, said
memory management system comprising:

a first recording mechanism which can be activated to record memory update events;
a second recording mechanism having a capacity to record at least a limited number
of memory update events;

a fault input for a fault signal to activate said first recording mechanism in the
event of a fault event; and

a memory reintegration mechanism to reintegrate at least parts of memory identified
in said first and second recording mechanisms.

2. A memory management system according to Claim 1, wherein said first
recording mechanism is a memory management unit comprising storage having an
entry for each of a plurality of memory pages, a code being written to a page
entry each time that page is written to when said first recording mechanism has
been activated.

3. A memory management system according to Claim 2, wherein said first
recording mechanism has an enable input connected to receive the fault signal for
activating said first recording mechanism.

4. A memory management system according to Claim 1, wherein said second
recording mechanism maintains a record of recent memory update events up to a
number sufficient to cover the time to activate said first recording mechanism
following a fault event.

5. A memory management system according to Claim 4, wherein said second
recording mechanism comprises a first-in-first-out buffer.

6. A memory management system according to Claim 5, wherein said first
recording mechanism is connected to an output of said first-in-first-out buffer.

7. A memory management system according to Claim 6, wherein said first-in-
first-out buffer stores up to a predetermined number of update addresses, an
address decoder is connected to said output of said first-in-first-out buffer to
generate a page signal representative of a memory update address output from said
first-in-first-out buffer and said address decoder is responsive to said fault
signal to pass said page signal to said first recording mechanism.

8. A memory management system according to Claim 1, wherein said second
recording mechanism comprises a logic analyzer.

9. A memory management system according to Claim 1, wherein said second
recording mechanism maintains a record of recent memory update events up to a
number sufficient to cover the time to activate said first recording mechanism
following a fault event, the operation of said second recording means being
inhibited in response to said fault signal.

10. A memory management system according to Claim 9, wherein said first
recording mechanism comprises a software generated list of update events.

11. A memory management system according to Claim 1, wherein said second
recording mechanism comprises a table look-aside buffer and a memory access table
maintained in main memory.

12. A memory management system according to Claim 1, wherein said memory
reintegration mechanism is operative to reintegrate memory pages identified in
said first and second recording mechanisms.

13. A fault tolerant computer system comprising a plurality of synchronous
processing sets, each comprising a processor and internal memory and operating in
lockstep, and an out-of-sync detector for detecting an out-of-sync event and for
generating an out-of-sync signal, wherein each processing set includes a memory
management system according to Claim 1.

14. A fault tolerant computer system comprising a plurality of synchronous
processing sets, each comprising a processor and internal memory and operating in
lockstep, and an out-of-sync detector for detecting an out-of-sync event and for
generating an out-of-sync signal, wherein each processing set also comprises:

a first recording mechanism which can be activated to record memory write events;
a second recording mechanism having a capacity to record at least a limited number
of memory write events;

a fault input for receiving said out-of-sync signal to activate said first recording
mechanism in the event of an out-of-sync event; and

a memory reintegration mechanism to reintegrate at least parts of memory identified
in said first and second recording mechanisms.

15. A fault tolerant computer system according to Claim 14, wherein said first
recording mechanism is a memory management unit comprising a RAM with an entry
for each of a plurality of memory pages, a code being written to a page entry each
time that page is written to when said first recording mechanism has been
activated.

16. A fault tolerant computer system according to Claim 15, wherein said first
recording mechanism has an enable input connected to receive the out-of-sync
signal for activating said first recording mechanism.

17. A fault tolerant computer system according to Claim 14, wherein said second
recording mechanism maintains a record of recent memory update events up to a
number sufficient to cover the time to activate said first recording mechanism
following an out-of-sync event.

18. A fault tolerant computer system according to Claim 17, wherein said second
recording mechanism comprises a first-in-first-out buffer.

19. A fault tolerant computer system according to Claim 18, wherein said first
recording mechanism is connected to an output of said first-in-first-out buffer.

20. A fault tolerant computer system according to Claim 19, wherein said first-in-
first-out buffer stores up to a predetermined number of update addresses, an
address decoder is connected to said output of said first-in-first-out buffer to
generate a page signal representative of a memory update address output from said
first-in-first-out buffer and said address decoder is responsive to said
out-of-sync signal to pass said page signal to said first recording mechanism.

21. A fault tolerant computer system according to Claim 14, wherein said second
recording mechanism comprises a logic analyzer.

22. A fault tolerant computer system according to Claim 14, wherein said second
recording mechanism maintains a record of recent memory update events up to a
number sufficient to cover the time to activate said first recording mechanism
following an out-of-sync event, the operation of said second recording means being
inhibited in response to said out-of-sync signal.

23. A fault tolerant computer system according to Claim 22, wherein said first
recording mechanism comprises a software generated list of update events.

24. A fault tolerant computer system according to Claim 14, wherein said second
recording mechanism comprises a table look-aside buffer and a memory access table
maintained in main memory.

25. A fault tolerant computer system according to Claim 14, wherein said memory
reintegration mechanism is operative to reintegrate memory pages identified in
said first and second recording mechanisms.

26. A fault tolerant computer system according to Claim 14, comprising three
synchronous processing sets operating in lockstep, wherein said out-of-sync
detector determines an out-of-sync processing set by majority voting.

27. A fault tolerant computer system according to Claim 26, wherein said out-of-
sync detector is arranged to select one of the remaining two processing sets, to
supply an input to the out-of-sync processing set and the remaining processing set
to cause said out-of-sync and remaining processing sets to idle, to reintegrate one
of said out-of-sync and remaining processing sets while maintaining a software log
of memory write events, and then to reintegrate said other of said out-of-sync and
remaining processing sets using said software log.

28. A method for reintegration of a processing set of a fault tolerant computer
system comprising a plurality of synchronous processing sets, each comprising a
processor and internal memory and operating in lockstep, and a fault detector for
detecting a fault event and for generating a fault signal, said method comprising:
maintaining a temporary record of memory update events over a limited period;
responding to said fault signal to activate a further record of memory update events
following said fault state; and

performing memory reintegration in a processor in which a fault has occurred for at
least those parts of memory identified in said temporary and further memory records.

29. A method according to Claim 28, wherein said fault event is an out-of-sync
event.

30. A method according to Claim 28, wherein the further record is stored in a
storage of a memory management unit, a page entry in said storage being provided
for each page of memory, a code being written to a page entry each time that page
is written to when said first recording mechanism has been activated.

31. A method according to Claim 28, comprising maintaining a record of recent
memory update events up to a number sufficient to cover the time to activate said
further record.

32. A method according to Claim 31, wherein said temporary record is stored in
a first-in-first-out buffer.


33. A method according to Claim 32, comprising connecting a recording
mechanism for the further recording to an output of said first-in-first-out buffer.

34. A method according to Claim 33, comprising the steps of storing up to a
predetermined number of update addresses in said first-in-first-out buffer,
supplying the output of said first-in-first-out buffer to an address decoder to
generate a page signal representative of a memory update address output from said
first-in-first-out buffer, and recording said page signal as part of said further
recording when said fault signal is active.

35. A method according to Claim 31, wherein said temporary record is stored in
a logic analyzer.

36. A method according to Claim 31, comprising the steps of maintaining said
temporary record of recent memory update events up to a number sufficient to cover
the time to activate said first recording mechanism following a fault, said
temporary record being inhibited in response to said fault signal.

37. A method according to Claim 36, comprising generating a list of update events
by software.

38. A method according to Claim 31, comprising forming said temporary record
by means of a table look-aside buffer and a memory access table maintained in main
memory.

39. A method according to Claim 31, comprising the step of reintegrating memory
pages identified in said temporary and further records.

40. A method according to Claim 31, comprising three synchronous processing
sets operating in lockstep, wherein an out-of-sync detector determines an
out-of-sync processing set by majority voting.

41. A method according to Claim 31, comprising the steps of, in response to the
identification of an out-of-sync processing set, selecting one of the remaining two
processing sets, supplying an interrupt to the out-of-sync processing set and the
remaining processing set to cause said out-of-sync and remaining processing sets to
idle, reintegrating one of said out-of-sync and remaining processing sets while
maintaining a software log of memory write events, and then reintegrating said other
of said out-of-sync and remaining processing sets using said software log.

ABSTRACT OF THE DISCLOSURE:

A memory management system for a fault tolerant computer system includes
a first recording mechanism (25) which can be activated to record memory update
(write) events, a second recording mechanism having a capacity to record at least
a limited number of memory update events, a fault input for a fault signal to
activate the first recording mechanism in the event of a fault (out-of-sync) event
and a memory reintegration mechanism (27) to reintegrate at least parts of memory
identified in the first and second recording mechanisms. The recording of memory
updates is preferably done [...] a dirty RAM and a secondary dirty page record.
Recovery from a minor out-of-sync event between processing sets in a lockstep
system can be achieved rapidly and efficiently by reintegrating only the parts of
memory identified in the first and second recording mechanisms [...] out-of-sync
processing set, as only a relatively small part of the memory system of either the
out-of-sync or the running processing set will have been modified.



